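State-of-the-art machine-learning-based models are a popular choice for modeling and forecasting energy behavior in buildings because, given enough data, they are good at finding spatiotemporal patterns and structures even in scenarios where the complexity prohibits analytical descriptions. However, machine-learning-based models for building energy forecasting have difficulty generalizing to out-of-sample scenarios that are not represented in the data, because their architecture typically has no physical correspondence to the mechanistic structures associated with the phenomena governing energy transfer. Their ability to forecast for unseen initial and boundary conditions therefore depends entirely on how representative the data are, which is not guaranteed for building measurement data. These limitations impede their use in real-world engineering applications such as energy management in digital twins. In response, we present a domain adaptation (DA) framework that aims to leverage the well-established understanding of the phenomena governing energy behavior in buildings to forecast for out-of-sample scenarios beyond the building measurement data. More specifically, we represent mechanistic knowledge of energy behavior using low-rank linear time-invariant state space models, and then leverage their governing structure to forecast for a target energy system for which only building measurement data are available. We achieve this by aligning the physics-derived subspace that governs global state space behavior more closely with the target subspace derived from the measurement data. In this initial exploration, we focus on linear energy systems. We test the subspace-based DA framework by varying the thermophysical properties of the source and target systems to demonstrate the transferability of mechanistic models from physics to measurement data.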
translated by 谷歌翻译
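To make the "linear time-invariant state space model" in the first abstract concrete, the following is a minimal sketch of a discrete-time LTI system x[k+1] = A x[k] + B u[k]. The two-node thermal network, the matrices `A` and `B`, and the helper names `matvec` and `simulate_lti` are hypothetical illustrations, not the authors' model or their subspace alignment procedure.

```python
# Minimal sketch: discrete-time LTI state-space model x[k+1] = A x[k] + B u[k].
# The two-node thermal network and all numeric values are hypothetical.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def simulate_lti(A, B, x0, inputs):
    """Return the state trajectory [x0, x1, ...] for a sequence of input vectors."""
    x = list(x0)
    states = [list(x)]
    for u in inputs:
        Ax = matvec(A, x)
        Bu = matvec(B, u)
        x = [a + b for a, b in zip(Ax, Bu)]
        states.append(x)
    return states

# Hypothetical wall with two temperature nodes: node 0 exchanges heat with
# the outdoor air (the input), node 1 exchanges heat only with node 0.
A = [[0.8, 0.1],
     [0.1, 0.9]]
B = [[0.1],
     [0.0]]
x0 = [20.0, 20.0]          # initial node temperatures
outdoor = [[10.0]] * 200   # constant outdoor temperature for 200 steps

trajectory = simulate_lti(A, B, x0, outdoor)
final_state = trajectory[-1]
```

Because each row of `A` and `B` together sums to one and `A` is stable, both node temperatures relax toward the constant outdoor temperature. Changing the entries of `A` and `B` plays the role of changing thermophysical properties in this toy setting.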
Most research studying social determinants of health (SDoH) has focused on physician notes or structured elements of the electronic medical record (EMR). We hypothesize that clinical notes from social workers, whose role is to ameliorate social and economic factors, might provide a richer source of data on SDoH. We sought to perform topic modeling to identify robust topics of discussion within a large cohort of social work notes. We retrieved a diverse, deidentified corpus of 0.95 million clinical social work notes from 181,644 patients at the University of California, San Francisco. We used word frequency analysis and Latent Dirichlet Allocation (LDA) topic modeling analysis to characterize this corpus and identify potential topics of discussion. Word frequency analysis identified both medical and non-medical terms associated with specific ICD10 chapters. The LDA topic modeling analysis extracted 11 topics related to social determinants of health risk factors including financial status, abuse history, social support, risk of death, and mental health. In addition, the topic modeling approach captured the variation between different types of social work notes and across patients with different types of diseases or conditions. We demonstrated that social work notes contain rich, unique, and otherwise unobtainable information on an individual's SDoH.
We present a novel depth completion approach that is agnostic to the sparsity of depth points, which is very likely to vary in many practical applications. State-of-the-art approaches yield accurate results only when processing a specific density and distribution of input points, i.e., the one observed during training, which narrows their deployment in real use cases. In contrast, our solution is robust to uneven distributions and to extremely low densities never seen during training. Experimental results on standard indoor and outdoor benchmarks highlight the robustness of our framework: it achieves accuracy comparable to state-of-the-art methods when tested with the same density and distribution as in training, while being much more accurate in the other cases. Our pretrained models and further material are available on our project page.
A backdoor attack places triggers in victims' deep learning models to enable a targeted misclassification at testing time. In general, triggers are fixed artifacts attached to samples, which makes backdoor attacks easy to spot. Only recently has a new generation of harder-to-detect triggers been proposed: stylistic triggers, which apply stylistic transformations to the input samples (e.g., a specific writing style). The stylistic backdoor literature currently lacks a proper formalization of the attack, which we establish in this paper. Moreover, most studies of stylistic triggers focus on text and images, and it is not understood whether they can work on sound. This work fills this gap. We propose JingleBack, the first stylistic backdoor attack based on audio transformations such as chorus and gain. Using 444 models in a speech classification task, we confirm the feasibility of stylistic triggers in audio, achieving 96% attack success.
Did you know that over 70 million Dota2 players have their in-game data freely accessible? What if such data were used in malicious ways? This paper is the first to investigate this problem. Motivated by the widespread popularity of video games, we propose the first threat model for Attribute Inference Attacks (AIA) in the Dota2 context. We explain how (and why) attackers can exploit the abundant public data in the Dota2 ecosystem to infer private information about its players. Because concrete evidence on the efficacy of our AIA is lacking, we empirically demonstrate and assess their real-world impact. By conducting an extensive survey of $\sim$500 Dota2 players spanning over 26k matches, we verify whether a correlation exists between a player's Dota2 activity and their real life. Then, after finding such a link ($p\!<\!0.01$ and $\rho>0.3$), we ethically perform diverse AIA. We leverage machine learning to infer real-life attributes of our survey respondents from their publicly available in-game data. Our results show that, by applying domain expertise, some AIA can reach up to 98% precision and over 90% accuracy. This paper hence raises the alarm on a subtle but concrete threat that can potentially affect the entire competitive gaming landscape. We alerted the developers of Dota2.
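In the past few years, Convolutional Neural Networks (CNNs) have demonstrated promising performance in various real-world cybersecurity applications, such as network and multimedia security. However, the underlying fragility of CNN structures poses major security problems, making them unsuitable for security-oriented applications, including such computer networks. Protecting these architectures from adversarial attacks requires security-aware architectures that are challenging to attack. In this study, we present a novel architecture based on an ensemble classifier that combines the enhanced security of 1-Class classification (known as 1C) with the high performance of conventional 2-Class classification (known as 2C) in the absence of attacks. Our architecture is referred to as the 1.5-Class (SPRITZ-1.5C) classifier and is constructed from a final dense classifier, one 2C classifier (i.e., a CNN), and two parallel 1C classifiers (i.e., auto-encoders). In our experiments, we evaluated the robustness of the proposed architecture by considering eight possible adversarial attacks in various scenarios. We performed these attacks on the 2C and SPRITZ-1.5C architectures separately. The experimental results show that the Attack Success Rate (ASR) of the I-FGSM attack against a 2C classifier trained on the N-BaIoT dataset is 0.9900. In contrast, the ASR is 0.0000 for the SPRITZ-1.5C classifier.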
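As a rough illustration of the I-FGSM attack and the Attack Success Rate mentioned above, the sketch below attacks a toy logistic-regression classifier in pure Python. The model, the weights, the step sizes, and the ASR definition used here (the fraction of correctly classified samples whose prediction flips after the attack) are assumptions for illustration. They are not the paper's CNN, dataset, or evaluation protocol.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Binary prediction (0 or 1) of a logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0

def sign(v):
    return (v > 0) - (v < 0)

def ifgsm(w, b, x, y, eps, alpha, steps):
    """Iterative FGSM: repeated signed-gradient steps, kept within eps of x."""
    x_adv = list(x)
    for _ in range(steps):
        z = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
        # Gradient of binary cross-entropy w.r.t. the input is (p - y) * w.
        grad = [(sigmoid(z) - y) * wi for wi in w]
        x_adv = [xi + alpha * sign(g) for xi, g in zip(x_adv, grad)]
        # Project back into the L-infinity ball of radius eps around x.
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
    return x_adv

def attack_success_rate(w, b, samples, labels, eps, alpha, steps):
    """Fraction of correctly classified samples that flip after the attack."""
    attacked = flipped = 0
    for x, y in zip(samples, labels):
        if predict(w, b, x) != y:
            continue  # skip samples the model already gets wrong
        attacked += 1
        if predict(w, b, ifgsm(w, b, x, y, eps, alpha, steps)) != y:
            flipped += 1
    return flipped / attacked if attacked else 0.0

# Hypothetical toy model and data: two samples near the decision boundary
# and one far from it.
w, b = [1.0, -1.0], 0.0
samples = [[0.3, 0.0], [-0.3, 0.0], [2.0, 0.0]]
labels = [1, 0, 1]
asr = attack_success_rate(w, b, samples, labels, eps=0.25, alpha=0.05, steps=10)
```

With a perturbation budget of 0.25 per feature, the two samples close to the boundary are flipped and the distant one is not, so the toy ASR is two out of three.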
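Over the past few decades, the rise of artificial intelligence has given us the ability to solve some of the most challenging problems in our day-to-day lives, such as cancer prediction and autonomous navigation. However, these applications may not be reliable unless they are secured against adversarial attacks. In addition, recent work has shown that some adversarial examples transfer across different models. It is therefore crucial to prevent such transferability with robust models that resist adversarial manipulation. In this paper, we propose a feature-randomization-based approach that resists eight adversarial attacks targeting deep learning models in the testing phase. Our approach consists of changing the training strategy of the target network classifier and selecting random feature samples. We consider attackers under Limited-Knowledge and Semi-Knowledge conditions carrying out the most prevalent types of adversarial attacks. We evaluate the robustness of our approach on the well-known UNSW-NB15 dataset, which includes realistic and synthetic attacks. We then show that our strategy outperforms existing state-of-the-art approaches, such as the Most Powerful Attack, which consists of fine-tuning the network model against specific adversarial attacks. Finally, our experimental results show that our method can secure the target network and resists the transferability of adversarial attacks by over 60%.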
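When evaluating quantities of interest that depend on the solutions of differential equations, we inevitably face a trade-off between accuracy and efficiency. Especially for parametrized, time-dependent problems in engineering computations, acceptable computational budgets often limit the availability of accurate, high-fidelity simulation data. Multi-fidelity surrogate modeling has emerged as an effective strategy to overcome this difficulty. Its key idea is to leverage large amounts of low-fidelity simulation data, which are less accurate but much faster to compute, to improve approximations built from limited high-fidelity data. In this work, we introduce a novel data-driven framework of multi-fidelity surrogate modeling for parametrized, time-dependent problems using long short-term memory (LSTM) networks, in order to enhance output predictions both for unseen parameter values and forward in time simultaneously, a task known to be particularly challenging for data-driven models. We demonstrate the wide applicability of the proposed approach on a variety of engineering problems in which the high- and low-fidelity data are generated through fine versus coarse meshes, small versus large time steps, or finite element full-order versus deep learning reduced-order models. Numerical results show that the proposed multi-fidelity LSTM networks not only significantly improve on single-fidelity regression, but also outperform multi-fidelity models based on feed-forward neural networks.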
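Graph Neural Networks (GNNs), inspired by Convolutional Neural Networks (CNNs), aggregate the messages of a node's neighbors together with structural information to obtain expressive node representations for node classification, graph classification, and link prediction. Previous studies have shown that GNNs are vulnerable to Membership Inference Attacks (MIAs), which infer whether a node is in a GNN's training data and thereby leak the node's private information, such as a patient's disease history. Previous MIAs rely on the model's probability output, which is infeasible if the GNN provides only the predicted label for an input (label-only). In this paper, we propose a label-only MIA against GNNs for node classification that exploits the GNNs' flexible prediction mechanism, e.g., the ability to obtain the predicted label of a node even when its neighbors' information is unavailable. For most datasets and GNN models, our attack achieves around 60% accuracy, precision, and Area Under the Curve (AUC), and in some cases it is competitive with or even better than state-of-the-art probability-based MIAs implemented under our environment and settings. In addition, we analyze the influence of the sampling method, the model selection approach, and the overfitting level on the attack performance of our label-only MIA. These factors all affect attack performance. We then consider scenarios in which the assumptions about the adversary's additional dataset (the shadow dataset) and about extra information on the target model are relaxed. Even in these scenarios, our label-only MIA achieves better attack performance in most cases. Finally, we explore the effectiveness of possible defenses, including Dropout, Regularization, Normalization, and Jumping Knowledge. None of these four defenses prevents our attack completely.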
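Emotion recognition is involved in several real-world applications. As the number of available modalities grows, the automatic understanding of emotions is becoming more accurate. The success of Multimodal Emotion Recognition (MER) relies primarily on the supervised learning paradigm. However, data annotation is expensive and time-consuming, and because emotion expression and perception depend on several factors (e.g., age, gender, culture), obtaining highly reliable labels is hard. Motivated by this, we focus on unsupervised feature learning for MER. We consider discrete emotions and use text, audio, and vision as modalities. Our method, based on a contrastive loss between pairwise modalities, is the first attempt of its kind in the MER literature. Our end-to-end feature learning approach has several differences from (and advantages over) existing MER methods: i) it is unsupervised, so learning incurs no data labeling cost; ii) it does not require spatial data augmentation, modality alignment, a large batch size, or a large number of epochs; iii) it applies data fusion only at inference; and iv) it does not require backbones pre-trained on an emotion recognition task. Experiments on benchmark datasets show that our method outperforms several baseline approaches and unsupervised learning methods applied to MER. In particular, it even surpasses a few supervised state-of-the-art MER methods.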
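The abstract above does not spell out the loss, so the following is only a sketch of one common way to build a contrastive loss between pairwise modalities: a symmetric InfoNCE-style loss summed over every pair among text, audio, and vision. The function names, the temperature value, and the toy embeddings are hypothetical, and the embeddings are assumed to be non-zero vectors.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchors, candidates, temperature=0.1):
    """Mean cross-entropy of matching anchors[i] to candidates[i] within the batch."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [cosine(a, c) / temperature for c in candidates]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)

def pairwise_contrastive_loss(modalities, temperature=0.1):
    """Sum a symmetric InfoNCE term over every pair of modalities."""
    loss = 0.0
    for name_a, name_b in combinations(sorted(modalities), 2):
        a, b = modalities[name_a], modalities[name_b]
        loss += 0.5 * (info_nce(a, b, temperature) + info_nce(b, a, temperature))
    return loss

# Hypothetical 2-D embeddings for a batch of two samples per modality.
aligned = {
    "text":   [[1.0, 0.0], [0.0, 1.0]],
    "audio":  [[0.9, 0.1], [0.1, 0.9]],
    "vision": [[1.0, 0.2], [0.2, 1.0]],
}
# Same embeddings, but the audio rows no longer match the other modalities.
misaligned = dict(aligned, audio=[[0.1, 0.9], [0.9, 0.1]])

loss_aligned = pairwise_contrastive_loss(aligned)
loss_misaligned = pairwise_contrastive_loss(misaligned)
```

The loss is lower when embeddings of the same sample agree across modalities, which is the signal such an objective uses in place of emotion labels.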